WTIMIT: The TIMIT Speech Corpus Transmitted Over The 3G AMR Wideband Mobile Network
Abstract
In anticipation of upcoming mobile telephony services with higher speech quality, WTIMIT, a wideband (50 Hz to 7 kHz) mobile telephony derivative of TIMIT, has been recorded. It opens up various scientific investigations, e.g., on speech quality and intelligibility, as well as on wideband upgrades of network-side interactive voice response (IVR) systems with retrained or bandwidth-extended acoustic models for automatic speech recognition (ASR). Wideband telephony could enable network-side speech recognition applications such as remote dictation or spelling without the need for distributed speech recognition techniques. The WTIMIT corpus was transmitted via two prepared Nokia 6220 mobile phones over T-Mobile’s 3G wideband mobile network in The Hague, The Netherlands, employing the Adaptive Multirate Wideband (AMR-WB) speech codec. The paper presents observations of transmission effects and phoneme recognition experiments. It turns out that in the case of wideband telephony, server-side ASR should not be carried out by simply decimating received signals to an 8 kHz sampling rate and applying existing narrowband acoustic models. Nor do we recommend just simulating the AMR-WB codec for training of wideband acoustic models. Instead, real-world wideband telephony channel data (such as WTIMIT) provides the best training material for wideband IVR systems.
Similar resources
I-vector Speaker Verification for Speech Degraded by Narrowband and Wideband Channels
Voice biometrics are frequently exposed to channel degradations of transmitted speech and to channel mismatch between enrolment and test utterances, which cause speaker recognition systems to perform poorly. In this paper, the influence of channel bandwidth and speech coding on speaker verification is assessed employing the state-of-the-art i-vector technique. Our focus is on the possible benef...
The adaptive multirate wideband speech codec (AMR-WB)
This paper describes the Adaptive Multirate Wideband (AMR-WB) speech codec recently selected by the Third Generation Partnership Project (3GPP) for GSM and the third generation mobile communication WCDMA system for providing wideband speech services. The AMR-WB speech codec algorithm was selected in December 2000 and the corresponding specifications were approved in March 2001. The AMR-WB codec...
STC-TIMIT: Generation of a Single-channel Telephone Corpus
This paper describes a new speech corpus, STC-TIMIT, and discusses the process of design, development and its distribution through LDC. The STC-TIMIT corpus is derived from the widely used TIMIT corpus by sending it through a real and single telephone channel. TIMIT is phonetically balanced, covers the dialectal diversity in continental USA and has been extensively used as a benchmark for speec...
A candidate proposal for a 3GPP adaptive multi-rate wideband speech codec
This paper describes an adaptive multi-rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving Third Generation (3G) mobile speech services. The speech codec is based on SB-CELP (Splitband-Code-Excited Linear Prediction) with five modes operating bit rates from 24kbit/s down to S.lkbit/s. The respective channel coding schemes are based on RSC (Recursive Systema...
AMR wideband codec - leap in mobile communication voice quality
The Third Generation Partnership Project (3GPP) and European Telecommunications Standards Institute (ETSI) have carried out development and standardisation of a wideband speech codec for GSM and the third generation mobile communication WCDMA system since 1999. The Adaptive Multi-Rate Wideband (AMR-WB) codec algorithm was selected in December 2000, and the corresponding specifications were appr...